An investigation of temporally varying weight regression for noise robust speech recognition

Authors

Shilin Liu

Khe Chai Sim

Abstract

In this paper, the recently proposed Temporally Varying Weight Regression (TVWR) is investigated in two ways for noise-robust speech recognition. First, typical model compensation approaches assume that the noise features are independent and identically distributed (i.i.d.), so non-stationary noise environments can be poorly compensated by conventional model compensation in the standard Hidden Markov Model (HMM) framework. TVWR, however, retains the basic HMM structure while adding a time-varying property; model compensation for TVWR is therefore proposed so that the i.i.d. noise assumption can be relaxed. Second, although Noise Adaptive Training (NAT) has been proposed to optimize the "pseudo-clean" HMM for better performance by maximizing the likelihood of multi-condition data, NAT depends heavily on the simplicity of the Vector Taylor Series (VTS) formulation. Hence, other advanced compensation approaches, such as Trajectory-based Parallel Model Combination (TPMC), have difficulty benefiting from this powerful training scheme. This paper exploits the time-varying attribute of TVWR to approximate NAT such that any compensation technique can be applied during noise adaptive training. Experiments on the Aurora 4 corpus show that significant improvements over the standard HMM or NAT system can be obtained by compensating TVWR, whether trained on clean data or adaptively trained on multi-condition data.

Similar resources

Improving the performance of MFCC for Persian robust speech recognition

The Mel frequency cepstral coefficients (MFCCs) are the most widely used features in speech recognition, but they are very sensitive to noise. In this paper, to achieve satisfactory performance in Automatic Speech Recognition (ASR) applications, we introduce a new noise-robust set of MFCC vectors estimated through the following steps. First, spectral mean normalization is a pre-processing which applies to t...

Joint adaptation and adaptive training of TVWR for robust automatic speech recognition

Context-dependent Deep Neural Networks (DNNs) have obtained consistent and significant improvements over Gaussian Mixture Model (GMM) based systems for various speech recognition tasks. However, since a DNN is discriminatively trained, it is more sensitive to label errors and is not reliable for unsupervised adaptation. Moreover, DNN parameters do not have a clear and meaningful interpretation, theref...

A new method for robust speech recognition based on missing data using a bidirectional neural network

The performance of speech recognition systems is greatly reduced when speech is corrupted by noise. One common approach to robust speech recognition is the missing feature method. In this approach, the components in the time-frequency representation of the signal (spectrogram) that present a low signal-to-noise ratio (SNR) are tagged as missing and deleted, then replaced using the remaining components and statistical ...

An Information-Theoretic Discussion of Convolutional Bottleneck Features for Robust Speech Recognition

Convolutional Neural Networks (CNNs) have demonstrated their performance in speech recognition systems for feature extraction and acoustic modeling. In addition, CNNs have been used for robust speech recognition, and competitive results have been reported. The Convolutive Bottleneck Network (CBN) is a kind of CNN that has a bottleneck layer among its fully connected layers. The bottleneck fea...

Parameter clustering for temporally varying weight regression for automatic speech recognition

Recently, an implicit trajectory model using temporally varying weight regression (TVWR) was proposed and achieved promising gains using ML training criteria. In the original TVWR, each component weight is modelled as a constrained linear regression function with respect to the monophone posterior feature. Due to the high dimensionality of the posterior feature, many free parameters were introd...

Publication date: 2013
